{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Labeling: Excess Return Over Median\n",
    "\n",
    "![image_example](img/distribution_over_median.png)\n",
    "_*Fig. 1:*_ Distribution of excess return over the median for 22 stocks between January 2019 and May 2020.\n",
    "\n",
    "## Abstract\n",
    "\n",
    "In this notebook, we demonstrate how to label financial data by excess return over the median. Returns are calculated from cross-sectional price data covering many stocks. Each observation is labeled according to whether its return exceeds the median return of all stocks at the same time index. The labels can be numerical (the value of the excess return over the median) or categorical (the sign of that excess return). The user can also specify a resample period and optionally lag the returns to make them forward-looking.\n",
    "\n",
    "## Introduction\n",
    "This technique is used in the following paper:\n",
    "[\"The benefits of tree-based models for stock selection\"](https://link.springer.com/article/10.1057/jam.2012.17) by _Zhu et al._ (2012). \n",
    "\n",
    "In the paper, independent composite features are constructed as weighted averages of various parameters from fundamental and quantitative analysis, such as PE ratio, corporate cash flows, and debt. The composite features are used as inputs to a linear regression or a decision tree to predict whether a stock will outperform the market median return. The authors use monthly forward returns on stock price data.\n",
    "\n",
    "\n",
    "## How it works\n",
    "\n",
    "A dataframe containing stock returns is calculated from prices. The median return of all companies at time $t$ in the dataframe is used to represent the market return, and excess returns are calculated by subtracting the median return from each stock's return over the time period $t$. The numerical returns over median can then be used as-is (for regression analysis), or can be relabeled to represent their signs (for classification analysis).\n",
    "\n",
    "At time $t$:\n",
    "    \\begin{gather*}\n",
    "    P_t = \\{p_{t,0}, p_{t,1}, \\dots, p_{t,n}\\} \\\\\n",
    "    R_t = \\{r_{t,0}, r_{t,1}, \\dots, r_{t,n}\\} \\\\\n",
    "    m_t = \\operatorname{median}(R_t) \\\\\n",
    "    L(R_t) = \\{r_{t,0} - m_t, r_{t,1} - m_t, \\dots, r_{t,n} - m_t\\}\n",
    "    \\end{gather*}\n",
    "\n",
    "If categorical rather than numerical labels are desired, each stock $i$ is labeled by the sign of its excess return:\n",
    "\n",
    "$$\n",
    "     \\begin{equation}\n",
    "     \\begin{split}\n",
    "       L(r_{t,i}) = \\begin{cases}\n",
    "       -1 &\\ \\text{if} \\ \\ r_{t,i} - m_t < 0\\\\\n",
    "       0 &\\ \\text{if} \\ \\ r_{t,i} - m_t = 0\\\\\n",
    "       1 &\\ \\text{if} \\ \\ r_{t,i} - m_t > 0\\\\\n",
    "       \\end{cases}\n",
    "     \\end{split}\n",
    "     \\end{equation}\n",
    "$$\n",
    "\n",
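    "As a small worked illustration with made-up numbers (hypothetical values, not taken from the dataset used in this notebook), suppose three stocks have returns of 1%, -2%, and 3% at time $t$:\n",
    "\n",
    "\\begin{gather*}\n",
    "R_t = \\{0.01, -0.02, 0.03\\} \\\\\n",
    "m_t = \\operatorname{median}(R_t) = 0.01 \\\\\n",
    "L(R_t) = \\{0.01 - 0.01, -0.02 - 0.01, 0.03 - 0.01\\} = \\{0, -0.03, 0.02\\}\n",
    "\\end{gather*}\n",
    "\n",
    "The corresponding categorical labels would be $\\{0, -1, 1\\}$: the stock sitting exactly at the median gets 0, the one below the median gets -1, and the one above it gets 1.\n",
    "\n",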
    "The data may be more granular than necessary, for instance when the user only wants daily or weekly returns. In this case, a resampling period can be specified, for example 'B' to sample once per business day or 'W' to sample once per week. See the pandas documentation on [DateOffset objects](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for the available aliases. If a resample period is specified, returns are calculated using only the last price in each period. The user can also lag the returns to make them forward-looking.\n",
    "\n",
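    "The steps described in this section can be sketched in a few lines of pandas. This is a minimal illustration, not the `mlfinlab` source: the helper name `excess_over_median_sketch`, the parameter names `binary` and `resample_by`, the use of simple percentage returns, and the toy prices are all assumptions made for the example.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "\n",
    "def excess_over_median_sketch(prices, binary=False, resample_by=None, lag=True):\n",
    "    # Hypothetical helper for illustration only; not the mlfinlab implementation.\n",
    "    if resample_by is not None:\n",
    "        # Keep only the last price observed in each resample period.\n",
    "        prices = prices.resample(resample_by).last()\n",
    "    # Simple returns between consecutive rows (an assumption of this sketch).\n",
    "    returns = prices.pct_change()\n",
    "    if lag:\n",
    "        # Shift returns up one row so the label at t is the return from t to t+1.\n",
    "        returns = returns.shift(-1)\n",
    "    # The cross-sectional median at each timestamp stands in for the market return.\n",
    "    median = returns.median(axis=1)\n",
    "    excess = returns.sub(median, axis=0)\n",
    "    # Categorical labels are the sign of the excess return.\n",
    "    return np.sign(excess) if binary else excess\n",
    "\n",
    "\n",
    "# Toy prices for three made-up tickers over three days.\n",
    "prices = pd.DataFrame(\n",
    "    {'A': [100.0, 110.0, 121.0], 'B': [50.0, 50.0, 55.0], 'C': [20.0, 19.0, 19.0]},\n",
    "    index=pd.date_range('2020-01-01', periods=3, freq='D'),\n",
    ")\n",
    "labels = excess_over_median_sketch(prices, binary=True, lag=False)\n",
    "print(labels)\n",
    "```\n",
    "\n",
    "In this sketch, `lag=False` leaves the first row as NaN because there is no previous price to compare against, while `lag=True` would leave the last row as NaN instead, since its forward return is not yet known.\n",
    "\n",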
    "---\n",
    "## Examples of use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import yfinance as yf\n",
    "\n",
    "from mlfinlab.labeling import excess_over_median\n",
    "\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[*********************100%***********************]  22 of 22 completed\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>VZ</th>\n",
       "      <th>C</th>\n",
       "      <th>UBER</th>\n",
       "      <th>AAL</th>\n",
       "      <th>WFC</th>\n",
       "      <th>SYY</th>\n",
       "      <th>JPM</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>BABA</th>\n",
       "      <th>FB</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UA</th>\n",
       "      <th>NOK</th>\n",
       "      <th>CCL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>COST</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>CVX</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>19.760000</td>\n",
       "      <td>103.568062</td>\n",
       "      <td>54.096535</td>\n",
       "      <td>59.116608</td>\n",
       "      <td>NaN</td>\n",
       "      <td>32.219025</td>\n",
       "      <td>46.484818</td>\n",
       "      <td>60.661575</td>\n",
       "      <td>98.963676</td>\n",
       "      <td>7.837517</td>\n",
       "      <td>...</td>\n",
       "      <td>152.149994</td>\n",
       "      <td>147.570007</td>\n",
       "      <td>NaN</td>\n",
       "      <td>18.590000</td>\n",
       "      <td>5.846427</td>\n",
       "      <td>51.750248</td>\n",
       "      <td>39.946537</td>\n",
       "      <td>209.413116</td>\n",
       "      <td>148.035126</td>\n",
       "      <td>105.162872</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>19.799999</td>\n",
       "      <td>104.577469</td>\n",
       "      <td>54.827435</td>\n",
       "      <td>59.384232</td>\n",
       "      <td>NaN</td>\n",
       "      <td>31.146364</td>\n",
       "      <td>46.727222</td>\n",
       "      <td>60.884239</td>\n",
       "      <td>98.713730</td>\n",
       "      <td>7.689988</td>\n",
       "      <td>...</td>\n",
       "      <td>152.029999</td>\n",
       "      <td>144.300003</td>\n",
       "      <td>NaN</td>\n",
       "      <td>18.400000</td>\n",
       "      <td>5.924772</td>\n",
       "      <td>51.512947</td>\n",
       "      <td>39.842587</td>\n",
       "      <td>209.107468</td>\n",
       "      <td>148.552521</td>\n",
       "      <td>104.273567</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>20.850000</td>\n",
       "      <td>104.077667</td>\n",
       "      <td>54.172474</td>\n",
       "      <td>59.938599</td>\n",
       "      <td>NaN</td>\n",
       "      <td>33.124386</td>\n",
       "      <td>46.596699</td>\n",
       "      <td>60.535717</td>\n",
       "      <td>98.771400</td>\n",
       "      <td>7.929724</td>\n",
       "      <td>...</td>\n",
       "      <td>155.860001</td>\n",
       "      <td>145.830002</td>\n",
       "      <td>NaN</td>\n",
       "      <td>18.860001</td>\n",
       "      <td>6.032495</td>\n",
       "      <td>52.224846</td>\n",
       "      <td>38.699097</td>\n",
       "      <td>207.352509</td>\n",
       "      <td>157.060287</td>\n",
       "      <td>106.258125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>21.930000</td>\n",
       "      <td>105.028282</td>\n",
       "      <td>53.536488</td>\n",
       "      <td>61.190701</td>\n",
       "      <td>NaN</td>\n",
       "      <td>34.423378</td>\n",
       "      <td>46.736546</td>\n",
       "      <td>60.041988</td>\n",
       "      <td>99.396294</td>\n",
       "      <td>8.169458</td>\n",
       "      <td>...</td>\n",
       "      <td>159.210007</td>\n",
       "      <td>149.009995</td>\n",
       "      <td>NaN</td>\n",
       "      <td>19.490000</td>\n",
       "      <td>6.463387</td>\n",
       "      <td>52.689953</td>\n",
       "      <td>38.406132</td>\n",
       "      <td>206.129929</td>\n",
       "      <td>159.358887</td>\n",
       "      <td>105.986641</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>20.180000</td>\n",
       "      <td>102.980049</td>\n",
       "      <td>52.274014</td>\n",
       "      <td>61.028221</td>\n",
       "      <td>NaN</td>\n",
       "      <td>35.988075</td>\n",
       "      <td>46.447533</td>\n",
       "      <td>60.284012</td>\n",
       "      <td>99.867378</td>\n",
       "      <td>7.985046</td>\n",
       "      <td>...</td>\n",
       "      <td>158.919998</td>\n",
       "      <td>147.470001</td>\n",
       "      <td>NaN</td>\n",
       "      <td>19.370001</td>\n",
       "      <td>6.355664</td>\n",
       "      <td>53.534740</td>\n",
       "      <td>37.357147</td>\n",
       "      <td>207.806046</td>\n",
       "      <td>137.328262</td>\n",
       "      <td>105.003731</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  AMD        MSFT         VZ          C  UBER        AAL  \\\n",
       "Date                                                                       \n",
       "2019-01-22  19.760000  103.568062  54.096535  59.116608   NaN  32.219025   \n",
       "2019-01-23  19.799999  104.577469  54.827435  59.384232   NaN  31.146364   \n",
       "2019-01-24  20.850000  104.077667  54.172474  59.938599   NaN  33.124386   \n",
       "2019-01-25  21.930000  105.028282  53.536488  61.190701   NaN  34.423378   \n",
       "2019-01-28  20.180000  102.980049  52.274014  61.028221   NaN  35.988075   \n",
       "\n",
       "                  WFC        SYY        JPM         F  ...        BABA  \\\n",
       "Date                                                   ...               \n",
       "2019-01-22  46.484818  60.661575  98.963676  7.837517  ...  152.149994   \n",
       "2019-01-23  46.727222  60.884239  98.713730  7.689988  ...  152.029999   \n",
       "2019-01-24  46.596699  60.535717  98.771400  7.929724  ...  155.860001   \n",
       "2019-01-25  46.736546  60.041988  99.396294  8.169458  ...  159.210007   \n",
       "2019-01-28  46.447533  60.284012  99.867378  7.985046  ...  158.919998   \n",
       "\n",
       "                    FB  ZM         UA       NOK        CCL        PFE  \\\n",
       "Date                                                                    \n",
       "2019-01-22  147.570007 NaN  18.590000  5.846427  51.750248  39.946537   \n",
       "2019-01-23  144.300003 NaN  18.400000  5.924772  51.512947  39.842587   \n",
       "2019-01-24  145.830002 NaN  18.860001  6.032495  52.224846  38.699097   \n",
       "2019-01-25  149.009995 NaN  19.490000  6.463387  52.689953  38.406132   \n",
       "2019-01-28  147.470001 NaN  19.370001  6.355664  53.534740  37.357147   \n",
       "\n",
       "                  COST        NVDA         CVX  \n",
       "Date                                            \n",
       "2019-01-22  209.413116  148.035126  105.162872  \n",
       "2019-01-23  209.107468  148.552521  104.273567  \n",
       "2019-01-24  207.352509  157.060287  106.258125  \n",
       "2019-01-25  206.129929  159.358887  105.986641  \n",
       "2019-01-28  207.806046  137.328262  105.003731  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load price data for 22 stocks\n",
    "tickers = \"AAPL MSFT COST PFE SYY F GE BABA AMD CCL ZM FB WFC JPM NVDA CVX AAL UBER C UA VZ NOK\"\n",
    "\n",
    "data = yf.download(tickers, start=\"2019-01-20\", end=\"2020-05-25\",\n",
    "                   group_by=\"ticker\")\n",
    "# Keep only the adjusted close of each ticker, then drop the redundant column level\n",
    "data = data.loc[:, (slice(None), 'Adj Close')]\n",
    "data.columns = data.columns.droplevel(1)\n",
    "data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Labeling a simple dataframe of prices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
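    "We compute the excess return over the median for all stocks in the dataset. With `lag=False`, the return at each date is measured from the previous observation, so the first row is NaN. UBER and ZM have no prices at the start of the sample, so their labels are NaN there as well."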
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>VZ</th>\n",
       "      <th>C</th>\n",
       "      <th>UBER</th>\n",
       "      <th>AAL</th>\n",
       "      <th>WFC</th>\n",
       "      <th>SYY</th>\n",
       "      <th>JPM</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>BABA</th>\n",
       "      <th>FB</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UA</th>\n",
       "      <th>NOK</th>\n",
       "      <th>CCL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>COST</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>CVX</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>0.001406</td>\n",
       "      <td>0.009129</td>\n",
       "      <td>0.012893</td>\n",
       "      <td>0.003909</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.033911</td>\n",
       "      <td>0.004597</td>\n",
       "      <td>0.003053</td>\n",
       "      <td>-0.003143</td>\n",
       "      <td>-0.019441</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.001406</td>\n",
       "      <td>-0.022777</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.010838</td>\n",
       "      <td>0.012783</td>\n",
       "      <td>-0.005203</td>\n",
       "      <td>-0.003220</td>\n",
       "      <td>-0.002077</td>\n",
       "      <td>0.002877</td>\n",
       "      <td>-0.009074</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>0.043061</td>\n",
       "      <td>-0.014748</td>\n",
       "      <td>-0.021915</td>\n",
       "      <td>-0.000634</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.053538</td>\n",
       "      <td>-0.012762</td>\n",
       "      <td>-0.015693</td>\n",
       "      <td>-0.009385</td>\n",
       "      <td>0.021206</td>\n",
       "      <td>...</td>\n",
       "      <td>0.015223</td>\n",
       "      <td>0.000634</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.015031</td>\n",
       "      <td>0.008213</td>\n",
       "      <td>0.003851</td>\n",
       "      <td>-0.038669</td>\n",
       "      <td>-0.018362</td>\n",
       "      <td>0.047302</td>\n",
       "      <td>0.009063</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>0.034036</td>\n",
       "      <td>-0.008629</td>\n",
       "      <td>-0.029502</td>\n",
       "      <td>0.003127</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.021453</td>\n",
       "      <td>-0.014761</td>\n",
       "      <td>-0.025918</td>\n",
       "      <td>-0.011436</td>\n",
       "      <td>0.012470</td>\n",
       "      <td>...</td>\n",
       "      <td>0.003731</td>\n",
       "      <td>0.004044</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.015642</td>\n",
       "      <td>0.053666</td>\n",
       "      <td>-0.008857</td>\n",
       "      <td>-0.025333</td>\n",
       "      <td>-0.023659</td>\n",
       "      <td>-0.003127</td>\n",
       "      <td>-0.020317</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>-0.070535</td>\n",
       "      <td>-0.010238</td>\n",
       "      <td>-0.014317</td>\n",
       "      <td>0.006609</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.054719</td>\n",
       "      <td>0.003080</td>\n",
       "      <td>0.013295</td>\n",
       "      <td>0.014004</td>\n",
       "      <td>-0.013309</td>\n",
       "      <td>...</td>\n",
       "      <td>0.007443</td>\n",
       "      <td>-0.001071</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.003107</td>\n",
       "      <td>-0.007402</td>\n",
       "      <td>0.025297</td>\n",
       "      <td>-0.018049</td>\n",
       "      <td>0.017396</td>\n",
       "      <td>-0.128981</td>\n",
       "      <td>-0.000010</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                 AMD      MSFT        VZ         C  UBER       AAL       WFC  \\\n",
       "Date                                                                           \n",
       "2019-01-22       NaN       NaN       NaN       NaN   NaN       NaN       NaN   \n",
       "2019-01-23  0.001406  0.009129  0.012893  0.003909   NaN -0.033911  0.004597   \n",
       "2019-01-24  0.043061 -0.014748 -0.021915 -0.000634   NaN  0.053538 -0.012762   \n",
       "2019-01-25  0.034036 -0.008629 -0.029502  0.003127   NaN  0.021453 -0.014761   \n",
       "2019-01-28 -0.070535 -0.010238 -0.014317  0.006609   NaN  0.054719  0.003080   \n",
       "\n",
       "                 SYY       JPM         F  ...      BABA        FB  ZM  \\\n",
       "Date                                      ...                           \n",
       "2019-01-22       NaN       NaN       NaN  ...       NaN       NaN NaN   \n",
       "2019-01-23  0.003053 -0.003143 -0.019441  ... -0.001406 -0.022777 NaN   \n",
       "2019-01-24 -0.015693 -0.009385  0.021206  ...  0.015223  0.000634 NaN   \n",
       "2019-01-25 -0.025918 -0.011436  0.012470  ...  0.003731  0.004044 NaN   \n",
       "2019-01-28  0.013295  0.014004 -0.013309  ...  0.007443 -0.001071 NaN   \n",
       "\n",
       "                  UA       NOK       CCL       PFE      COST      NVDA  \\\n",
       "Date                                                                     \n",
       "2019-01-22       NaN       NaN       NaN       NaN       NaN       NaN   \n",
       "2019-01-23 -0.010838  0.012783 -0.005203 -0.003220 -0.002077  0.002877   \n",
       "2019-01-24  0.015031  0.008213  0.003851 -0.038669 -0.018362  0.047302   \n",
       "2019-01-25  0.015642  0.053666 -0.008857 -0.025333 -0.023659 -0.003127   \n",
       "2019-01-28  0.003107 -0.007402  0.025297 -0.018049  0.017396 -0.128981   \n",
       "\n",
       "                 CVX  \n",
       "Date                  \n",
       "2019-01-22       NaN  \n",
       "2019-01-23 -0.009074  \n",
       "2019-01-24  0.009063  \n",
       "2019-01-25 -0.020317  \n",
       "2019-01-28 -0.000010  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess1 = excess_over_median(prices=data, lag=False)\n",
    "excess1.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can visualize the distribution as a histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 1.0, 'Distribution of Return Over Median for 22 Stocks')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZYAAAEWCAYAAABFSLFOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3de7gcVZnv8e+PBEOQS8IkICSBgMQLcDTABjziBQG5CaIzoIhKRJyI4jiOOIegjFyUEUeBGUZHAY0EFCGgaLgoBBCVUSABwiUgsoVIQkISLoEAMZDwnj/Waqh0unt371299+7w+zxPP921VtWqt6ur++26rVJEYGZmVpb1BjoAMzNbtzixmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyomlZJK+L+nfSmpra0nPShqSh2+S9Kky2s7t/UrSpLLaa2G+X5f0uKTH+nve1jxJ4yWFpKF5uG3ri9cJkDRP0r4DHUcZnFhakD/4FZKWS1om6Q+SjpX08nKMiGMj4mtNttVwJYqIRyJio4hYXULsp0j6cVX7B0bEtL623WIc44DjgR0i4nU16veS9FJOqMslPSDp6Bbav0DS18uMucn5HizpNknPSXpC0k8kje3H+YekxZUkkMuGSloiqZSL1dq1vvS0TvSivc0l/VTSQklPS/pfSXsU6t8n6eb8HX5M0vmSNm7Q3jvyd/1pSU/m9nbLdZ+QdHNfY17XOLG07pCI2BjYBjgDOAH4YdkzKf5ArGO2AZ6IiCUNxlkYERsBmwD/Apwv6Y39EVxvlrukw4CLgf8CRgE7AiuBmyWN7Mf4lgEHFoYPAp4qc/5t0sw6UVOd5bERMAvYFdgMmAZcLWmjXL8p8HVgK+DNwFjgW3Xa3wS4Cvjv3NYY4FTS52v1RIQfTT6AecC+VWW7Ay8BO+XhC4Cv59ejSCvlMuBJ4PekZH5RnmYF8Czw/4DxQADHAI8AvyuUDc3t3QR8A7gNeBr4JbBZrtsLWFArXuAA4AXgxTy/uwrtfSq/Xg84CfgrsAS4ENg011XimJRjexz4SoPltGmefmlu76Tc/r75Pb+U47igxrS13scS4PDC8JuAmXmZPgB8KJdPzu/xhdz+lbk8gO0L0xc/o72ABaQ/CI/lz6ZSdnye9yLg6DrvVfk9/r+q8vWAe4HTgGF5HdipUD86L4vN8/DBwJw83h+At1R9jicAd5N+0IbWiCPycr6sUHY58BUgqj6bH+b39CjpB3ZIrhsCfDt/vg8Bx7H2+ldZX14P3Ag8kcf/CTCiKuYv5ZifBi4FNqgRd811Ang/MDcvj5uAN7eyPGrM5xlg1zp1fw/cU6euC1hWp+7NwN+A1Tn2ZY3W/8J0/wjcDywH7gN2qf59Ia3jDwNH5OET8ue1nLTO7zPQv4cNl/dAB9BJD2okllz+CPCZ/PoCXvnR+gbwfWD9/HgnoFpt8cqP94XAa4Hh1E4sjwI75XF+Bvw41+1FncSSX59SGbdQfxOv/FB8EugGtiP94/s5cFFVbOfnuN6av9BvrrOcLiQlvY3ztH8GjqkXZ9W0L9eTfpzfT/rR2TmXvRaYDxwNDAV2If2w7Vi9/Att9pRYVgHfJCWA4YWy0/LndhDwPDCyRrxvyu1vW6PuVOCP+fVU4PRC3XHAr/PrXUgJbA/Sj/uk/NkNK3yOc4BxwPA6yy3yerEYGJEfi3NZFMb7BXBuXo6bk/6kfDrXHQv8Kc9nM+A31E8s2wPvzctsNOmP0H9WrXu3kbYKNiP9kB7b02eeh98APJfbX5/0x6sbeE2zy6Oq/YmkBLBpnfr/BC6pU7cJKXlOI20Njqyq/wRwcwvr/+Gk7/BupD8l2wPbFL+veX14BDg4l7+RtM5vVfg+vr7dv3d9eXhXWDkWkr481V4EtiStOC9GxO8jrxkNnBIRz0XEijr1F0XEvRHxHPBvwIcqB/f76KPAWRHxUEQ8C5wI
HFG1q+HUiFgREXcBd5ESzBpyLB8GToyI5RExDzgT+HgLsWwlaRnpn+wVwBcj4s5cdzAwLyJ+FBGrIuIOUoI9rKV3u6aXgJMjYmVhub8InJY/t2tI/0hr7Y4blZ8X1ahbVKi/GPhIoe7IXAbpH+y5EXFrRKyOdBxjJfC2wvjnRMT8BusFpB/PK0nL/whgRi4DQNIWpB/HL+R1bAlwdh4X4EOk5DA/Ip4k/TGqKSK6I2JmXmZLgbOAd1eNdk5ELMxtXUn6gW/Gh4Grc/svkraihgNvr2q7p+VR2ZV1EWndfbpG/XtJifyrtaaPiGeAd/DKH6ulkmbkZVlrfj2t/58C/iMiZkXSHRF/LTTxTtLnNikirsplq0kJfAdJ60fEvIj4S6P3PdCcWMoxhrRbptq3SP+0rpP0kKQpTbQ1v4X6v5L+0Y2qM24rtsrtFdseChS/QMUzdp4nbdlUGwW8pkZbY1qIZWFEjCD9WzwH2LtQtw2wRz7wuiwnoI8CfTnouzQi/lZV9kRErCoM13u/j+fnLWvUbVmovxEYLmkPSduQfmSvyHXbAMdXvadxpM+koqf1ouJC4Kj8uLCqbhvS+rKoMJ9zSVsu5PlVr1815QPkl0h6VNIzwI9Zez1sZn2pZY11MSJeynEV16Eel4ek4aSEdktErJUkJb2NlNwPi4g/12snIu6PiE9ExFjSFuBWpK2cWnpa/8cBjZLCscAfIuI3hfl3A18g7XVYkpf7VnWmHxScWPoonx0yBljrzJD8j+X4iNgOOAT4oqR9KtV1muxpi2Zc4fXWpH/Wj5N2HWxYiGsIaRdFs+0uJP3wFNteRdqd0orHc0zVbT3aYjtExErSvuX/I+kDuXg+8NuIGFF4bBQRn6lMVqOp5yksG9ZOQn05a+oB0vGYw4uF+UzBfwBuyO/lJWA6aavlSOCqiFheeE+nV72nDSPip72I8fekhLYFa6+T80lbQqMK89kkInbM9YtYe/2q5xs5prdExCbAx0i7dsqwxrooSTmu4jrUcHlIGkba7fco8Oka9TuTtgw+GRE3NBtYRPyJtCt1pzpx9LT+zycdn6rnWGBrSWdXzffiiHhHbjdIu24HLSeWXpK0iaSDgUtIxy7uqTHOwZK2z1+MZ0ibtJVThxeTjme06mOSdpC0IekYwOWRTkf+M7BBPpVyfdIBw2GF6RYD44unRlf5KfAvkrbNZ8/8O3Bp1b/2HuVYpgOnS9o4/zv/Iukfbcsi4gXSroTKroqrgDdI+rik9fNjN0lvzvW1lusc4EhJQyQdwNq7bHot79r8EnCSpCMlDZf0OuAHpC2u4g/ExaTdJB/lld1gkHaxHJu3ZiTptflzrHsKbA/xHAK8v3q3a0QsAq4Dzszr73qSXi+psjymA5+XNDafzdZoC3tj8gFrSWOAf2011gamA++TtE9el48nJcQ/NDNxnuZy0q7Uo3JSL9bvBPwa+KeIuLKHtt4k6fjKqeP51OiPALfkURYDYyW9Bppa/38AfEnSrvmz3j6PU7GcdLLNuySdkef5Rkl752T5t/y++nwJQjs5sbTuSknLSf88vkLat1zvOosJwPWkL+Afgf+JiJty3TdIP0bLJH2phflfRPrH9BiwAfB5gLz/+LOkFfdR0hbMgsJ0l+XnJyTdUaPdqbnt35HORvkb8E8txFX0T3n+D5H+NV+c2++tqaR/cYfkf/n7kY4LLCQth8qBd0hnPO2Ql+svctk/k35sK7vNfkGJIuJS0j70fyH9Y72PdExgz4h4ojDeraTlshXwq0L5bNJxlu+QTg/uJh0U7m08cyNibp3qo0i7au7L87qcV3bjnQ9cSzp+dgfpBI56TiUdZH4auLqHcVsSEQ+QtoD+m7Q8DyGd5v9Ck028nXQsbj9S4ns2P96Z648nbc3/sFBXb3ktJ51Ucauk50gJ5d7cBqRdnHOBxyRVdnvWXf8j4jLg9Fy2nLQurnF8NiKWkU5cOFDS10jr9hl5WTxG2nX55SaXxYBQ
9Hgs2czMrHneYjEzs1I5sZiZWamcWMzMrFROLGZmVqp1sqPDUaNGxfjx4wc6DDOzjnL77bc/HhGjex6zsXUysYwfP57Zs2cPdBhmZh1FUt3eFlrRtl1hkjZQuj/FXZLmSjo1l28r6VZJD0q6tHJhkaRhebg7148vtHViLn9A0v7titnMzPquncdYVgJ7R8RbSf0iHZD75vkmcHZETCBdoHVMHv8Y4KmI2J50tfI3ASTtQLoYbkfSFan/o3I6XTQzszZoW2LJPXc+mwcr3cYHqUPBy3P5NKDSB9SheZhcv0/uCuVQUpfWKyPiYdJVybu3K24zM+ubtp4VlvtmmkO618RMUq+eywr9Ty3glV4/x5B7LM31TwN/VyyvMU1xXpMlzZY0e+nSpe14O2Zm1oS2JpZ8b4mJpFt/7k6649pao+XnWj2jRoPy6nmdFxFdEdE1enSfT2owM7Ne6pfrWHKnajeRblw0Qq/cPGosqSNBSFsi4+Dl+1hvSrrHycvlNaYxM7NBpp1nhY2WNCK/Hk665eb9pNudVu72N4l0C0/Id03Lrw8Dbszdfs8g3clwmKRtST0G39auuM3MrG/aeR3LlsC0fAbXesD0iLhK0n3AJZK+DtxJ6uac/HyRpG7SlsoRkLoAlzSd1M33KuC4fM8DMzMbhNbJbvO7urrCF0iambVG0u0R0dXXdtbJK+/NBqPxU65uarx5Z7yvzZGYtZc7oTQzs1I5sZiZWamcWMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1K1LbFIGifpN5LulzRX0j/n8lMkPSppTn4cVJjmREndkh6QtH+h/IBc1i1pSrtiNjOzvhvaxrZXAcdHxB2SNgZulzQz150dEd8ujixpB+AIYEdgK+B6SW/I1d8F3gssAGZJmhER97UxdjMz66W2JZaIWAQsyq+XS7ofGNNgkkOBSyJiJfCwpG5g91zXHREPAUi6JI/rxGJmNgj1yzEWSeOBnYFbc9HnJN0taaqkkblsDDC/MNmCXFav3MzMBqG2JxZJGwE/A74QEc8A3wNeD0wkbdGcWRm1xuTRoLx6PpMlzZY0e+nSpaXEbmZmrWtrYpG0Pimp/CQifg4QEYsjYnVEvASczyu7uxYA4wqTjwUWNihfQ0ScFxFdEdE1evTo8t+MmZk1pZ1nhQn4IXB/RJxVKN+yMNoHgXvz6xnAEZKGSdoWmADcBswCJkjaVtJrSAf4Z7QrbjMz65t2nhW2J/Bx4B5Jc3LZl4GPSJpI2p01D/g0QETMlTSddFB+FXBcRKwGkPQ54FpgCDA1Iua2MW4zM+uDdp4VdjO1j49c02Ca04HTa5Rf02g6MzMbPHzlvZmZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWamcWMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpWpbYpE0TtJvJN0vaa6kf87lm0maKenB/Dwyl0vSOZK6Jd0taZdCW5Py+A9KmtSumM3MrO/aucWyCjg+It4MvA04TtIOwBTghoiYANyQhwEOBCbkx2Tge5ASEXAysAewO3ByJRmZmdng07bEEhGLIuKO/Ho5cD8wBjgUmJZHmwZ8IL8+FLgwkluAEZK2BPYHZkbEkxHxFDATOKBdcZuZWd/0yzEWSeOBnYFbgS0iYhGk5ANsnkcbA8wvTLYgl9Urr57HZEmzJc1eunRp2W/BzMya1PbEImkj4GfAFyLimUaj
1iiLBuVrFkScFxFdEdE1evTo3gVrZmZ91tbEIml9UlL5SUT8PBcvzru4yM9LcvkCYFxh8rHAwgblZmY2CLXzrDABPwTuj4izClUzgMqZXZOAXxbKj8pnh70NeDrvKrsW2E/SyHzQfr9cZmZmg9DQNra9J/Bx4B5Jc3LZl4EzgOmSjgEeAQ7PddcABwHdwPPA0QAR8aSkrwGz8ninRcSTbYzbzMz6oG2JJSJupvbxEYB9aowfwHF12poKTC0vOjMzaxdfeW9mZqVyYjEzs1I5sZiZWamaOsYiaaeIuLfdwZh1mvFTrh7oEMwGnWa3WL4v6TZJn5U0oq0RmZlZR2sqsUTEO4CPki5UnC3pYknvbWtkZmbWkZo+xhIRDwInAScA7wbOkfQnSX/fruDMzKzzNJVYJL1F0tmkHor3Bg7J3eHvDZzdxvjMzKzDNHuB5HeA84EvR8SKSmFELJR0UlsiMzOzjtRsYjkIWBERqwEkrQdsEBHPR8RFbYvOzMw6TrPHWK4HhheGN8xlZmZma2g2sWwQEc9WBvLrDdsTkpmZdbJmE8tzknapDEjaFVjRYHwzM3uVavYYyxeAyyRVbrC1JfDh9oRkZmadrKnEEhGzJL0JeCOpK/w/RcSLbY3MzMw6Uiv3Y9kNGJ+n2VkSEXFhW6IyM7OO1WwnlBcBrwfmAKtzcQBOLGZmtoZmt1i6gB3yXR7NzMzqavassHuB17UzEDMzWzc0u8UyCrhP0m3AykphRLy/LVGZmVnHajaxnNLOIMzMbN3R7OnGv5W0DTAhIq6XtCEwpL2hmZlZJ2q22/x/BC4Hzs1FY4BftCsoMzPrXM0evD8O2BN4Bl6+6dfm7QrKzMw6V7OJZWVEvFAZkDSUdB2LmZnZGppNLL+V9GVgeL7X/WXAle0Ly8zMOlWziWUKsBS4B/g0cA3Q8M6RkqZKWiLp3kLZKZIelTQnPw4q1J0oqVvSA5L2L5QfkMu6JU1p5c2ZmVn/a/assJdItyY+v4W2LyDd0ri625ezI+LbxQJJOwBHADsCWwHXS3pDrv4u8F5gATBL0oyIuK+FOMzMrB8121fYw9Q4phIR29WbJiJ+J2l8k3EcClwSESuBhyV1A7vnuu6IeCjHcUke14nFzGyQaqWvsIoNgMOBzXo5z89JOgqYDRwfEU+RTl++pTDOglwGML+qfI9ajUqaDEwG2HrrrXsZmpmZ9VVTx1gi4onC49GI+E9g717M73ukXpInAouAM3O5as22QXmtGM+LiK6I6Bo9enQvQjMzszI0uytsl8LgeqQtmI1bnVlELC60eT5wVR5cAIwrjDoWqNytsl65mZkNQs3uCjuz8HoVMA/4UKszk7RlRCzKgx8k9ZoMMAO4WNJZpIP3E4DbSFssEyRtCzxKOsB/ZKvzNTOz/tPsWWHvabVhST8F9gJGSVoAnAzsJWkiaXfWPNKpy0TEXEnTSQflVwHHRcTq3M7ngGtJfZNNjYi5rcZiZmb9p9ldYV9sVB8RZ9Uo+0iNUX/YoI3TgdNrlF9Dum7GzMw6QCtnhe1G2mUFcAjwO9Y8Y8vMzKylG33tEhHLIV1BD1wWEZ9qV2BmZtaZmu3SZWvghcLwC8D40qMxM7OO1+wWy0XAbZKuIB14/yBrd9ViZmbW9Flhp0v6FfDOXHR0RNzZvrDMzKxTNbsrDGBD4JmI+C9gQb62xMzMbA3N3pr4ZOAE4MRctD7w43YFZWZmnavZLZYPAu8HngOIiIX0oksXMzNb9zWbWF6IiCB3ACnpte0LyczMOlmziWW6pHOBEZL+Ebie1m76ZWZmrxLNnhX27Xyv+2eANwJfjYiZbY3MzMw6Uo+JRdIQ4NqI2BdwMjEzs4Z63BWWexl+XtKm/RCPmZl1uGavvP8bcI+kmeQzwwAi4vNticrMzDpWs4nl6vwwMzNrqGFikbR1RDwSEdP6KyAzM+tsPR1j+UXlhaSftTkWMzNbB/SUWFR4vV07AzEzs3VDT4kl6rw2MzOr
qaeD92+V9Axpy2V4fk0ejojYpK3RmZlZx2mYWCJiSH8FYmZm64ZW7sdiZmbWIycWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NStS2xSJoqaYmkewtlm0maKenB/Dwyl0vSOZK6Jd0taZfCNJPy+A9KmtSueM3MrBzt3GK5ADigqmwKcENETABuyMMABwIT8mMy8D1IiQg4GdgD2B04uZKMzMxscGpbYomI3wFPVhUfClR6Sp4GfKBQfmEktwAjJG0J7A/MjIgnI+Ip0h0sq5OVmZkNIv19jGWLiFgEkJ83z+VjgPmF8Rbksnrla5E0WdJsSbOXLl1aeuBmZtacwXLwXjXKokH52oUR50VEV0R0jR49utTgzMysef2dWBbnXVzk5yW5fAEwrjDeWGBhg3IzMxuk+juxzAAqZ3ZNAn5ZKD8qnx32NuDpvKvsWmA/SSPzQfv9cpmZmQ1Szd7zvmWSfgrsBYyStIB0dtcZwHRJxwCPAIfn0a8BDgK6geeBowEi4klJXwNm5fFOi4jqEwLMzGwQaVtiiYiP1Knap8a4ARxXp52pwNQSQzMzszYaLAfvzcxsHdG2LRYz653xU65uetx5Z7yvjZGY9Y63WMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuXEYmZmpXJiMTOzUjmxmJlZqZxYzMysVE4sZmZWKicWMzMrlROLmZmVyonFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1INHegAzAab8VOuHugQzDragGyxSJon6R5JcyTNzmWbSZop6cH8PDKXS9I5krol3S1pl4GI2czMmjOQu8LeExETI6IrD08BboiICcANeRjgQGBCfkwGvtfvkZqZWdMG0zGWQ4Fp+fU04AOF8gsjuQUYIWnLgQjQzMx6NlCJJYDrJN0uaXIu2yIiFgHk581z+RhgfmHaBblsDZImS5otafbSpUvbGLqZmTUyUAfv94yIhZI2B2ZK+lODcVWjLNYqiDgPOA+gq6trrXozM+sfA7LFEhEL8/MS4Apgd2BxZRdXfl6SR18AjCtMPhZY2H/RmplZK/o9sUh6raSNK6+B/YB7gRnApDzaJOCX+fUM4Kh8dtjbgKcru8zMzGzwGYhdYVsAV0iqzP/iiPi1pFnAdEnHAI8Ah+fxrwEOArqB54Gj+z9kMzNrVr8nloh4CHhrjfIngH1qlAdwXD+EZmZmJRhMpxubmdk6wInFzMxK5cRiZmalcmIxM7NSObGYmVmpnFjMzKxUTixmZlYqJxYzMyuVE4uZmZXKtyY262DN3kZ53hnva3MkZq/wFouZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWal8Vpi9ajR7BpWZ9Y23WMzMrFROLGZmVionFjMzK5UTi5mZlcqJxczMSuWzwsxeBVo5I879illfeYvFzMxK5cRiZmalcmIxM7NS+RiLdTRfTV8+H4+xvvIWi5mZlapjtlgkHQD8FzAE+EFEnDHAIVkbeUukM/gOllZLRyQWSUOA7wLvBRYAsyTNiIj7BjYya4WTxauXd6+9unREYgF2B7oj4iEASZcAhwJOLC3wD7t1goFeT53Y+q5TEssYYH5heAGwR3EESZOByXlwpaR7+ym2vhgFPD7QQTTBcZbLcZar1Dj1zbJaWkOnLMs3ltFIpyQW1SiLNQYizgPOA5A0OyK6+iOwvnCc5XKc5XKc5emEGCHFWUY7nXJW2AJgXGF4LLBwgGIxM7MGOiWxzAImSNpW0muAI4AZAxyTmZnV0BG7wiJilaTPAdeSTjeeGhFzG0xyXv9E1meOs1yOs1yOszydECOUFKciouexzMzMmtQpu8LMzKxDOLGYmVmpOjaxSNpM0kxJD+bnkTXGmSjpj5LmSrpb0ocLddtKujVPf2k+KWBA4szj/VrSMklXVZVfIOlhSXPyY+IgjXOwLc9JeZwHJU0q
lN8k6YHC8ty8xNgOyG13S5pSo35YXjbdeVmNL9SdmMsfkLR/WTGVGaek8ZJWFJbd9wc4zndJukPSKkmHVdXV/PwHYZyrC8uzrSckNRHnFyXdl38rb5C0TaGuteUZER35AP4DmJJfTwG+WWOcNwAT8uutgEXAiDw8HTgiv/4+8JmBijPX7QMcAlxVVX4BcNhgWJ49xDloliewGfBQfh6ZX4/MdTcBXW2IawjwF2A74DXAXcAOVeN8Fvh+fn0EcGl+vUMefxiwbW5nSJuWX1/iHA/c2+51sYU4xwNvAS4sfkcaff6DKc5c9+wgWp7vATbMrz9T+NxbXp4du8VC6tJlWn49DfhA9QgR8eeIeDC/XggsAUZLErA3cHmj6fsrzhzfDcDyNsXQjF7HOQiX5/7AzIh4MiKeAmYCB7QpnoqXux2KiBeASrdDRcXYLwf2ycvuUOCSiFgZEQ8D3bm9wRZnf+oxzoiYFxF3Ay9VTdufn39f4uxPzcT5m4h4Pg/eQrpeEHqxPDs5sWwREYsA8nPDXRqSdidl6r8Afwcsi4hVuXoBqduYAY+zjtPz5unZkoaVG97L+hLnYFuetboAKsbzo7zr4d9K/MHsaZ5rjJOX1dOkZdfMtGXpS5wA20q6U9JvJb2zTTE2G2c7pm1VX+e1gaTZkm6R1K4/Y9B6nMcAv+rltIP7OhZJ1wOvq1H1lRbb2RK4CJgUES/V+THp9XnXZcVZx4nAY6SkeB5wAnBabxpqY5yDbXk2iuejEfGopI2BnwEfJ+2i6KtmlkG9cUpdfj3oS5yLgK0j4glJuwK/kLRjRDxTdpANYmj3tK3q67y2joiFkrYDbpR0T0T8paTYipqOU9LHgC7g3a1OWzGoE0tE7FuvTtJiSVtGxKKcOJbUGW8T4GrgpIi4JRc/DoyQNDT/I+tTFzFlxNmg7UX55UpJPwK+NAjjHGzLcwGwV2F4LOnYChHxaH5eLuli0i6CMhJLM90OVcZZIGkosCnwZJPTlqXXcUba4b4SICJul/QX0nHMUvqX6kWcjabdq2ram0qJqva8ev3Z5V30RMRDkm4CdibtVSlbU3FK2pf0B+7dEbGyMO1eVdPe1GhmnbwrbAZQOTthEvDL6hGUzky6ArgwIi6rlOOw2KcAAAUySURBVOcvyG+AwxpN319xNpJ/PCvHMT4AtKvX5l7HOQiX57XAfpJGKp01th9wraShkkYBSFofOJjylmcz3Q4VYz8MuDEvuxnAEflsrG2BCcBtJcVVWpySRivdG4n8D3sC6UDuQMVZT83Pf7DFmeMbll+PAvakfbcC6TFOSTsD5wLvj4jiH7bWl2d/nJHQjgdpn+8NwIP5ebNc3kW6wyTAx4AXgTmFx8Rctx3py9sNXAYMG6g48/DvgaXACtI/hP1z+Y3APaQfwB8DGw3SOAfb8vxkjqUbODqXvRa4HbgbmEu+I2mJsR0E/Jn0j/Mruew00hcVYIO8bLrzstquMO1X8nQPAAe2+bvTqziBf8jL7S7gDuCQAY5zt7wOPgc8Acxt9PkPtjiBt+fv9l35+ZgBjvN6YDGv/FbO6O3ydJcuZmZWqk7eFWZmZoOQE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sVhHKfQGe6+kKyWN6GH8EZI+2+aYdpR0o6Q/595fy+wqpjif8ZJC0tcKZaMkvSjpOy22Na9wTc8fyo7VXt2cWKzTrIiIiRGxE+mq9eN6GH8EqbfellQuBGxivOGkC83OiIg3AG8lXZ/Q52SWr3qv9hDpws6Kw0nXlvRaRLy9L9ObVXNisU72Rwqd4Un6V0mzcoedp+biM4DX562cb0naS4V7yUj6jqRP5NfzJH1V0s3A4Ur3bvmmpNvy1kitThePBP43Iq4DiNQ77OeAKZLWy22OKMyvW9IW+Sr2n+V4Z0naM9efIuk8SddRu6uZFcD9krry8IdJtyyotF+v3b+TdJ1SB5LnUuj/SdKz+Xkjpftw3CHpHkmH5vLxku6X
dL7SvY2uywnVrCYnFutIeYtiH3K3FJL2I3UxsjswEdhV0rtI92z5S97K+dcmmv5bRLwjIi7Jw0MjYnfgC8DJNcbfkXQ1/8sidSK4UX78EvhgjnEPYF5ELCZd9X92ROxGuqL9B4UmdgUOjYgj68R4CakLmLHAatbs86leuycDN0fEzqRltnWt9w58MCJ2Id2b48zCLr0JwHcjYkdgWW7brKZB3QmlWQ3DJc0h3TzpdtK9ISD1X7QfcGce3oj0Y/hIi+1fWjX88/x8e55nNVG/p9fI7X0V+BH5plm5bl9gh8KhmE2UelyG1JXGigYx/hr4Gqn7jep467X7LuDvASLiaklP1Xkv/54T8kukrcEtct3DETEnv663LMwAJxbrPCsiYqKkTYGrSMdYziH9KH4jIs4tjqzC7X+zVay5pb5BVf1zVcOVHl5XU/v7Mpf0o12c53akOwMul/RHYHtJo0mdiH49j7Ye8H+rE0hOCNUxrCEiXpB0O3A8aYvpkEJ1o3Z76r/po8BoYNeIeFHSPF5ZPisL460GvCvM6vKuMOtIEfE08HngS0o9FV8LfFLSRgCSxijdz345sHFh0r+S/tEPy8lpnz6G8hPgHUrdjVcO5p9DuoUykTrjuwI4C7g/Ip7I011HOhZDnm5ii/M9Ezih0F5FvXZ/R0ocSDqQdIvZapsCS3JSeQ+wTY1xzHrkxGIdKyLuJPUMe0Q+eH4x8EdJ95Buqbtx/uH933x68rciYj7pYPfdpKRwZ53mm41hBekWrydJeoDUS+0soHj676WknraLu60+D3TlEw3uA45tcb5zI2Jajap67Z4KvEvSHaRdhrV2Ef4kTzublIT+1EpMZhXu3djMzErlLRYzMyuVE4uZmZXKicXMzErlxGJmZqVyYjEzs1I5sZiZWamcWMzMrFT/H5tG/Uy0rulGAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "s2 = pd.Series(excess1.values.flatten())\n",
    "ax2 = s2.plot.hist(bins=50)\n",
    "ax2.set_xlim(-0.2,0.2)\n",
    "ax2.set_xlabel('Return Over Median')\n",
    "ax2.set_title('Distribution of Return Over Median for 22 Stocks')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of returning the numerical value of excess return over median, we can just return the sign. Using categorical rather than numerical labels alleviates problems that can arise due to extreme outlier returns [Zhu et al. 2012]."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>VZ</th>\n",
       "      <th>C</th>\n",
       "      <th>UBER</th>\n",
       "      <th>AAL</th>\n",
       "      <th>WFC</th>\n",
       "      <th>SYY</th>\n",
       "      <th>JPM</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>BABA</th>\n",
       "      <th>FB</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UA</th>\n",
       "      <th>NOK</th>\n",
       "      <th>CCL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>COST</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>CVX</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-22</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-23</th>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-24</th>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-25</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-01-28</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            AMD  MSFT   VZ    C  UBER  AAL  WFC  SYY  JPM    F  ...  BABA  \\\n",
       "Date                                                            ...         \n",
       "2019-01-22  1.0   1.0  1.0  1.0   NaN -1.0  1.0  1.0 -1.0 -1.0  ...  -1.0   \n",
       "2019-01-23  1.0  -1.0 -1.0 -1.0   NaN  1.0 -1.0 -1.0 -1.0  1.0  ...   1.0   \n",
       "2019-01-24  1.0  -1.0 -1.0  1.0   NaN  1.0 -1.0 -1.0 -1.0  1.0  ...   1.0   \n",
       "2019-01-25 -1.0  -1.0 -1.0  1.0   NaN  1.0  1.0  1.0  1.0 -1.0  ...   1.0   \n",
       "2019-01-28 -1.0  -1.0 -1.0 -1.0   NaN -1.0  1.0  1.0  1.0  1.0  ...  -1.0   \n",
       "\n",
       "             FB  ZM   UA  NOK  CCL  PFE  COST  NVDA  CVX  \n",
       "Date                                                      \n",
       "2019-01-22 -1.0 NaN -1.0  1.0 -1.0 -1.0  -1.0   1.0 -1.0  \n",
       "2019-01-23  1.0 NaN  1.0  1.0  1.0 -1.0  -1.0   1.0  1.0  \n",
       "2019-01-24  1.0 NaN  1.0  1.0 -1.0 -1.0  -1.0  -1.0 -1.0  \n",
       "2019-01-25 -1.0 NaN  1.0 -1.0  1.0 -1.0   1.0  -1.0 -1.0  \n",
       "2019-01-28 -1.0 NaN -1.0  1.0  1.0  1.0   1.0  -1.0  1.0  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess2 = excess_over_median(prices=data, binary=True)\n",
    "excess2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "-1.0    3629\n",
       " 1.0    3629\n",
       " 0.0      19\n",
       "dtype: int64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "excess2.stack().value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Forward labels with resampling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the paper, the authors use monthly forward-looking labels. Let's do that here by resampling the data and lagging the returns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>AMD</th>\n",
       "      <th>MSFT</th>\n",
       "      <th>VZ</th>\n",
       "      <th>C</th>\n",
       "      <th>UBER</th>\n",
       "      <th>AAL</th>\n",
       "      <th>WFC</th>\n",
       "      <th>SYY</th>\n",
       "      <th>JPM</th>\n",
       "      <th>F</th>\n",
       "      <th>...</th>\n",
       "      <th>BABA</th>\n",
       "      <th>FB</th>\n",
       "      <th>ZM</th>\n",
       "      <th>UA</th>\n",
       "      <th>NOK</th>\n",
       "      <th>CCL</th>\n",
       "      <th>PFE</th>\n",
       "      <th>COST</th>\n",
       "      <th>NVDA</th>\n",
       "      <th>CVX</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Date</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2019-01-31</th>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-02-28</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-03-31</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-04-30</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019-05-31</th>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>-1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>-1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            AMD  MSFT   VZ    C  UBER  AAL  WFC  SYY  JPM    F  ...  BABA  \\\n",
       "Date                                                            ...         \n",
       "2019-01-31 -1.0   1.0  1.0 -1.0   NaN -1.0 -1.0  1.0 -1.0 -1.0  ...   1.0   \n",
       "2019-02-28  1.0   1.0  1.0 -1.0   NaN -1.0 -1.0 -1.0 -1.0  1.0  ...   1.0   \n",
       "2019-03-31  1.0   1.0 -1.0  1.0   NaN  1.0 -1.0  1.0  1.0  1.0  ...  -1.0   \n",
       "2019-04-30  1.0   1.0  1.0 -1.0   NaN -1.0 -1.0  1.0 -1.0 -1.0  ...  -1.0   \n",
       "2019-05-31  1.0  -1.0 -1.0  1.0   1.0  1.0 -1.0 -1.0 -1.0 -1.0  ...   1.0   \n",
       "\n",
       "             FB   ZM   UA  NOK  CCL  PFE  COST  NVDA  CVX  \n",
       "Date                                                       \n",
       "2019-01-31 -1.0  NaN  1.0 -1.0 -1.0 -1.0   1.0   1.0  1.0  \n",
       "2019-02-28  1.0  NaN -1.0 -1.0 -1.0 -1.0   1.0   1.0  1.0  \n",
       "2019-03-31  1.0  NaN  1.0 -1.0  1.0 -1.0  -1.0  -1.0 -1.0  \n",
       "2019-04-30 -1.0  1.0  1.0  1.0  0.0  1.0   1.0  -1.0  1.0  \n",
       "2019-05-31 -1.0  1.0  1.0 -1.0 -1.0 -1.0   1.0   1.0 -1.0  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "monthly_forward = excess_over_median(prices=data, binary=True, resample_by='M', lag=True)\n",
    "monthly_forward.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can visualize the distribution of the numerical labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0.5, 1.0, 'Distribution of Monthly Forward Return Over Median')"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEWCAYAAABi5jCmAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAfk0lEQVR4nO3deZwcVbn/8c+XBARMIELCIhBGNCDovYJEQJFFFn8gyKJsghgUjYpcrhdUoiLiHvUq6sX7ExQ1BEEWBSJxYQ0IEiAssmsgRIiJEPbFICQ8949zGrom3dPVk6nuSeb7fr3mNV3bqae6q/upU6fqlCICMzOzmpW6HYCZmQ0uTgxmZlbgxGBmZgVODGZmVuDEYGZmBU4MZmZWMCQTg6QfSfrCAJU1VtIzkobl4RmSPjwQZefyfidpwkCV18Z6vyrpEUn/6PS6m5F0hKRr+pg+oO/9YDfUtnd51Xu/zb8Xm3QzplZWuMQgaa6kRZKelvSEpD9J+pikl7Y1Ij4WEV8pWdZufc0TEQ9ExIiIWDIAsZ8k6cxe5e8ZEVOWtew249gIOA7YIiLWazB9Z0kh6de9xr8pj58xADH05LKGL2tZJdZ1hKQl+Qtb+zul6vVWKe9LL+RtqX0P3trG8i33/YGm5NOSZufv8AOSJkt6RYfWX/l+DZB/L+YMRFlVWeESQ/buiBgJbAxMBo4HTh/olXTiR6tLNgYejYiH+5hnIfA2SWvXjZsA/LXSyKpzXf7C1v6ObreAqvaHZSj3nIgYAYwGrgTOG7io+tbPmH8ATAQ+AIwE9gR2Ac4dwNCAPuNb0fbrfllREwMAEfFkREwDDgYmSHojgKSfS/pqfj1a0sX5qOoxSX+UtJKkqcBY4Df5qOszdUexR0p6ALiiyZHtayXdIOlJSRdJWiuva2dJ8+pjrB2ZSdoD+BxwcF7fn/P0l04X5LhOkPQ3SQ9LOkPSmnlaLY4J+UjrEUmfb/beSFozL78wl3dCLn834FLg1TmOnzcp4nngQuCQXN4w4CDgF73W8zZJN+b34kZJb6ubNkPSVyRdm2t4l0ganSdfnf8/keN4a91y/y3pcUn3S9qzwba9In+W/1Y3bp18FDqm2XvSzvuUpx2RYz9Z0mPASXmerfP09+fPZIs8/GFJF+bX20i6Lu93CySdImmVuvWGpE9Img3MzuN2l3RPfi9PAVRmGyJiMelz2aB++yXtLelWvVyj+Pc8vtG+33Tfza9PknS+pDMlPQUckcedm9+/pyXdKWl8k/d5HHAUcFhEXBcRiyPiTuC9wB6SdpG0naR/5H2tttz+km7Lr1eSNEnSfZIezeuuffeW+u42ebvK7tevl3Rp3s/+IumgumlrS5om6SlJNwCv7bVsSHpdfr2XpFvyvA9KOqluvra+0wNphU4MNRFxAzAP2KHB5OPytDHAuqQf54iIw4EHSLWPERHxrbpldgI2B/5fk1V+APgQ8GpgMelIqFWMvwe+Tj7Ki4g3NZjtiPz3DmATYATQ+5TH24HNgF2BEyVt3mSV/wOsmcvZKcf8wYi4jHSkNj/HcUQfYZ+Rl4P0XtwJzK9NzF/K6aTtXxv4LjBdxaOxQ4EPAusAqwCfyuN3zP9H5Tiuy8PbAn8hHQV/CzhdUuEHMiL+BfwSeH/d6PcBl0XEwj62p5GG71Pd9G2BOTn+rwFXATvXbcOcvFxt+Kr8egnwX3k73kr6vI7qte79cvlb5IT5K+CEvMx9wPZlNiAnnA8AjwKP53FvBn4KfJT02ZwKTJP0ihb7fl/2Bc4HRvHyD+k+pM9iFDCNpffXml2Befm7+pKIeBCYCeweETOBZ0m1iJpDgbPy62NI79lOpO/e48APe62n1XcXWu/XryQdPJ1F+tzfB/yvpDfkWX4IPAesT/od+FAf63o2r2sUsBfwcUn79Zqn7Hd6wAyJxJDNB9ZqMP4F0ge4cUS8EBF/jNYdSJ0U
Ec9GxKIm06dGxB0R8SzwBeCg+qOcZXAY8N2ImBMRzwCfBQ5RsbbypYhYFBF/Bv4MLJVgciwHA5+NiKcjYi7wHeDwdoKJiD8Ba0najLRzn9Frlr2A2RExNR8Bng3cA7y7bp6fRcRf83t5LrBli9X+LSJ+nNt0ppA+u3UbzDcFOFQvty0dDkzto9zt8pFz7W+7ku/T/Ij4n7x9i0g//LVEsAPwjbrhnfJ0IuKmiJiZl5tL+mHeiaJvRMRjudx3AXdFxPkR8QLwPaDVhQEHSXoCWAR8BDgg1x7Iw6dGxPURsSS3Y/0L2K5FmX25LiIujIgX674b10TEb/PnNZUG+2M2GljQZNqCPB3gbNIPMZJGkt6Xs/O0jwKfj4h5+eDgJOCAXt+PVt/dMvv13sDciPhZ/vxuJiXtA/I+817gxLyeO0j7YrN1zYiI2/N7dlvelt77Qcvv9EAbSolhA+CxBuO/DdwLXCJpjqRJJcp6sI3pfwNW5uUde1m8OpdXX/Zwij+M9T8W/yTVKnobTTo6713WBv2IaSpwNKkWc0GLeButp0y89V6aPyL+mV8utUxEXE86GttJ0uuB15GOWJuZGRGj6v5mUu596r0vXAXsIGk9YBhwDrC9pB5SzeNWAEmbKp3C/Ec+9fJ1lt5H6st+df1wPnhptR+eGxGjSPvHHcDWddM2Bo6rT4bARnk9/dUont6f76pqfH7/EVKSb2T9PB3SUfp7lBqk3wPcHBG1z2dj4IK67bmbVDOr/360es9q+tqvNwa27fXeHQasRzrzMJylfwMakrStpCuVTlU+CXyMpfeDdr8jy2xIJAZJbyF9mZe61DEfCR4XEZuQjmSPlbRrbXKTIlvVKDaqez2WVCt5hPRDtXpdXMNIO1LZcueTdsr6shcDD7VYrrdHcky9y/p7m+VA+gIdBfy27oe6pne87axnILr9nUI6nXQ4cH5EPNfm8mXep0KcEXEv6ct7DHB1RDxN+mJPJB09v5hn/f+k2tO4iFiDdAqzd5tBfdkLqNuv8umzjSghIh4hHU2fJKn24/sg8LVeyXD1XKtbartove82WqYdVwAbSdqmfqTSFXLbAZfnbbmL9EO7J8XTSLVt2rPXNq0aEU0/rz70tV8/CFzVaz0jIuLjpMbrxSz9G9DMWaQDlo0iYk3gR5RsO6rSCp0YJK0haW/SOc4zI+L2BvPsLel1+Yv2FOkIo3bp6UOkc8vter+kLSStDnyZ9KO0hHRlw6q5wWll0vni+kvxHgJ66k5/9HY28F+SXiNpBC+3SSxuMn9DOZZzga9JGilpY+BY4My+l2xY1v2kqm+jRrHfAptKOlTScEkHA1sAF5coeiHwIv17/2umAvuTkkPv0wEtLcP7dBXpaLPWnjCj1zCkq26eAp7JNZqPtyhzOvAGSe/JR9zHkI5Qy27LPcAfgM/kUT8GPpaPWCXplXm/HJmn9973W+27yyQi/kr6UfxF7TRePmf/K1Lb0GV1s59F2v4dKV5p9SPSZ7UxgKQxkvbtZzx97dcXk/brwyWtnP/eImnzvM/8mpSEV1e68KCv+5BGAo9FxHM5KR7an3gH2oqaGH4j6WlSZv88qdHzg03mHQdcBjwDXAf8b0TMyNO+AZyQq4ufarJ8I1OBn5OOFFcl7cRExJOko5CfkI46nyU1fNfUdvJHJd3coNyf5rKvBu4nNXD9Rxtx1fuPvP45pJrUWbn8tkXENRExv8H4R0nnY48jNXx+Btg7H8G2KvOfpMbca2vn/PsR1zzgZtJR4h/bXT7rz/t0FekLf3WTYUiN7IcCT5N+pM/pq8D8nh1Iuvz6UdJ+e20b2wHptOlESetExCxSO8MppEbae0kXNtQU9v0S++5AODqXfybp+/h7UlJ9b6/5ziY18F/Ra1/6Puno+5L8/Z9Jarzvlz7266eBd5KuXJpP+p5/k5cT5dGk0z3/IP0O/KyP1RwFfDnHeyIVXJrbH2rdzmq2/JL0U1ID8QndjsVsebGi3qBl
Rm7wfQ+wVXcjMVu+rKinkmyIk/QV0pU4387ni82sJJ9KMjOzgkpPJUmaS2pcWwIsjojx+W7Yc4AeYC5wUEQ8XmUcZmZWXqU1hpwYxtdfOSDpW6TLsybnm8leFRHH91XO6NGjo6enp7I4zcxWRDfddNMjEdFW/2DQncbnfXm5L5kppMvR+kwMPT09zJo1q9qozMxWMJKa3nXdl6obn4N0TfFNkibmcetGxAKA/H+dRgtKmihplqRZCxe22++ZmZn1V9U1hu0jYr6kdYBLJd1TdsGIOA04DWD8+PFuITcz65BKawy1uwYjPfDlAmAb4KFafy35f18PgzEzsw6rLDHkvldG1l6TbiG/g3TLeq3vkAnARVXFYGZm7avyVNK6pC5wa+s5KyJ+L+lG4FxJR5IeBnJghTGYmVmbKksMkR52vdQDJXLHarsuvYSZmQ0G7hLDzMwKnBjMzKzAicHMzArc7bYNGT2TpjccP3fyXh2OxGxwc43BzMwKnBjMzKzAicHMzAqcGMzMrMCJwczMCpwYzMyswInBzMwKnBjMzKzAicHMzAqcGMzMrMCJwczMCpwYzMyswInBzMwKnBjMzKzAicHMzAqcGMzMrMCJwczMCpwYzMyswInBzMwKnBjMzKzAicHMzAqcGMzMrMCJwczMCpwYzMyswInBzMwKhnc7ALNu65k0veH4uZP36nAkZoODawxmZlbgxGBmZgVODGZmVuDEYGZmBZUnBknDJN0i6eI8/BpJ10uaLekcSatUHYOZmZXXiRrDfwJ31w1/Ezg5IsYBjwNHdiAGMzMrqdLEIGlDYC/gJ3lYwC7A+XmWKcB+VcZgZmbtqbrG8D3gM8CLeXht4ImIWJyH5wEbNFpQ0kRJsyTNWrhwYcVhmplZTWWJQdLewMMRcVP96AazRqPlI+K0iBgfEePHjBlTSYxmZra0Ku983h7YR9K7gFWBNUg1iFGShudaw4bA/ApjMDOzNlVWY4iIz0bEhhHRAxwCXBERhwFXAgfk2SYAF1UVg5mZta8b9zEcDxwr6V5Sm8PpXYjBzMya6EgnehExA5iRX88BtunEes2WhTvXs6HKdz6bmVmBE4OZmRU4MZiZWYETg5mZFTgxmJlZgRODmZkVODGYmVmBE4OZmRU4MZiZWYETg5mZFTgxmJlZgRODmZkVODGYmVmBE4OZmRU4MZiZWYETg5mZFTgxmJlZgRODmZkVODGYmVmBE4OZmRU4MZiZWYETg5mZFTgxmJlZgRODmZkVODGYmVmBE4OZmRU4MZiZWYETg5mZFTgxmJlZwfBuB2DWXz2TpjccP3fyXh2OxGzF4hqDmZkVODGYmVmBE4OZmRU4MZiZWUGpxCDpje0WLGlVSTdI+rOkOyV9KY9/jaTrJc2WdI6kVdot28zMqlO2xvCj/CN/lKRRJZf5F7BLRLwJ2BLYQ9J2wDeBkyNiHPA4cGTbUZuZWWVKJYaIeDtwGLARMEvSWZJ2b7FMRMQzeXDl/BfALsD5efwUYL/+BG5mZtUofR9DRMyWdAIwC/gBsJUkAZ+LiF83WkbSMOAm4HXAD4H7gCciYnGeZR6wQZNlJwITAcaOHVs2TLOm9zeYWTll2xj+XdLJwN2kI/53R8Tm+fXJzZaLiCURsSWwIbANsHmj2Zose1pEjI+I8WPGjCkTppmZDYCyNYZTgB+TageLaiMjYn6uRfQpIp6QNAPYDhglaXiuNWwIzG8/bDMzq0rZxud3AWfVkoKklSStDhARUxstIGlMraFa0mrAbqQax5XAAXm2CcBF/Q/fzMwGWtnEcBmwWt3w6nlcX9YHrpR0G3AjcGlEXAwcDxwr6V5gbeD09kI2M7MqlT2VtGrdFUZExDO1GkMzEXEbsFWD8XNI7Q1mZjYIla0xPCvpzbUBSVsDi/qY38zMllNlawyfBM6TVGsoXh84uJqQzMysm0olhoi4UdLrgc0AAfdExAuVRmZmZl3RzoN63gL05GW2kkREnFFJVGZm1jWl
EoOkqcBrgVuBJXl0AE4MZmYrmLI1hvHAFhHR8C5lMzNbcZS9KukOYL0qAzEzs8GhbI1hNHCXpBtI3WkDEBH7VBKVmZl1TdnEcFKVQZiZ2eBR9nLVqyRtDIyLiMvyXc/Dqg3NzMy6oWy32x8hPVzn1DxqA+DCqoIyM7PuKdv4/Alge+ApSA/tAdapKigzM+uesonhXxHxfG1A0nCaPGDHzMyWb2UTw1WSPgeslp/1fB7wm+rCMjOzbimbGCYBC4HbgY8CvwVaPrnNzMyWP2WvSnqR9GjPH1cbjpmZdVvZvpLup0GbQkRsMuARmZlZV7XTV1LNqsCBwFoDH46ZmXVbqTaGiHi07u/vEfE9YJeKYzMzsy4oeyrpzXWDK5FqECMricjMzLqq7Kmk79S9XgzMBQ4a8GjMzKzryl6V9I6qAzEzs8Gh7KmkY/uaHhHfHZhwzMys29q5KuktwLQ8/G7gauDBKoIyM7PuaedBPW+OiKcBJJ0EnBcRH64qMDMz646yXWKMBZ6vG34e6BnwaMzMrOvK1himAjdIuoB0B/T+wBmVRWVmZl1T9qqkr0n6HbBDHvXBiLilurDMzKxbyp5KAlgdeCoivg/Mk/SaimIyM7MuKvtozy8CxwOfzaNWBs6sKigzM+uesjWG/YF9gGcBImI+7hLDzGyFVDYxPB8RQe56W9IrqwvJzMy6qWxiOFfSqcAoSR8BLsMP7TEzWyGVvSrpv/Oznp8CNgNOjIhLK43MbJDqmTS94fi5k/fqcCRm1WiZGCQNA/4QEbsBpZOBpI1I9zqsB7wInBYR35e0FnAO6Qa5ucBBEfF4+6GbmVkVWp5KioglwD8lrdlm2YuB4yJic2A74BOStgAmAZdHxDjg8jxsZmaDRNk7n58Dbpd0KfnKJICIOKbZAhGxAFiQXz8t6W5gA2BfYOc82xRgBulSWDMzGwTKJobp+a9fJPUAWwHXA+vmpEFELJC0Tn/LNTOzgddnYpA0NiIeiIgp/V2BpBHAr4BPRsRTksouNxGYCDB27Nj+rt6sY9wobSuKVm0MF9ZeSPpVu4VLWpmUFH4REb/Oox+StH6evj7wcKNlI+K0iBgfEePHjBnT7qrNzKyfWiWG+sP7TdopWKlqcDpwd68nvE0DJuTXE4CL2inXzMyq1aqNIZq8LmN74HBSo/WtedzngMmkG+aOBB4ADmyzXDMzq1CrxPAmSU+Rag6r5dfk4YiINZotGBHXUKxx1Nu17UjNzKwj+kwMETGsU4GYmdng0M7zGMzMbAhwYjAzswInBjMzK3BiMDOzAicGMzMrcGIwM7MCJwYzMytwYjAzswInBjMzK3BiMDOzAicGMzMrcGIwM7MCJwYzMytwYjAzswInBjMzK2j1oB6zAdczaXrD8XMn79XhSMysEdcYzMyswInBzMwKnBjMzKzAbQw2aLjtwWxwcI3BzMwKnBjMzKzAicHMzAqcGMzMrMCNz2Zd4sZ2G6xcYzAzswInBjMzK3BiMDOzAicGMzMrcGIwM7MCJwYzMytwYjAzswLfx2BWsWb3K5gNVpXVGCT9VNLDku6oG7eWpEslzc7/X1XV+s3MrH+qPJX0c2CPXuMmAZdHxDjg8jxsZmaDSGWJISKuBh7rNXpfYEp+PQXYr6r1m5lZ/3S68XndiFgAkP+v0+H1m5lZC4P2qiRJEyXNkjRr4cKF3Q7HzGzI6HRieEjS+gD5/8PNZoyI0yJifESMHzNmTMcCNDMb6jqdGKYBE/LrCcBFHV6/mZm1UOXlqmcD1wGbSZon6UhgMrC7pNnA7nnYzMwGkcpucIuI9zWZtGtV67QVk28QM+usQdv4bGZm3eHEYGZmBU4MZmZW4E70rMAPqDcz1xjMzKzAicHMzAqcGMzMrMCJwczMCtz4bJXxjWn909f75osArBNcYzAzswInBjMzK3BiMDOzArcxmC1HfAOidYJrDGZmVuDEYGZmBU4MZmZW4DYGsxWA2x5sILnGYGZmBU4MZmZW
4MRgZmYFTgxmZlbgxmcrxY2bZkOHawxmZlbgxGBmZgVODGZmVuA2BlsmfhjP0OA2pqHFNQYzMytwYjAzswInBjMzK3BiMDOzAicGMzMrcGIwM7MCJwYzMyvwfQz0fS1+1ddpt3t9uK8nt3a0e59Ju/uR98eXVf1edPK9do3BzMwKupIYJO0h6S+S7pU0qRsxmJlZYx1PDJKGAT8E9gS2AN4naYtOx2FmZo11o8awDXBvRMyJiOeBXwL7diEOMzNrQBHR2RVKBwB7RMSH8/DhwLYRcXSv+SYCE/PgZsBfOhpoMhp4pAvr7SZv89DgbR4aNouIke0u1I2rktRg3FLZKSJOA06rPpzmJM2KiPHdjKHTvM1Dg7d5aJA0qz/LdeNU0jxgo7rhDYH5XYjDzMwa6EZiuBEYJ+k1klYBDgGmdSEOMzNroOOnkiJisaSjgT8Aw4CfRsSdnY6jpK6eyuoSb/PQ4G0eGvq1zR1vfDYzs8HNdz6bmVmBE4OZmRU4MdSRtJakSyXNzv9f1ce8a0j6u6RTOhnjQCuzzZK2lHSdpDsl3Sbp4G7EuqxadcUi6RWSzsnTr5fU0/koB1aJbT5W0l35c71c0sbdiHMgle1yR9IBkkLScn0Ja5ntlXRQ/pzvlHRWy0Ijwn/5D/gWMCm/ngR8s495vw+cBZzS7bir3mZgU2Bcfv1qYAEwqtuxt7mdw4D7gE2AVYA/A1v0muco4Ef59SHAOd2OuwPb/A5g9fz640Nhm/N8I4GrgZnA+G7HXfFnPA64BXhVHl6nVbmuMRTtC0zJr6cA+zWaSdLWwLrAJR2Kq0ottzki/hoRs/Pr+cDDwJiORTgwynTFUv9enA/sKqnRDZnLi5bbHBFXRsQ/8+BM0n1Fy7OyXe58hXRQ9Fwng6tAme39CPDDiHgcICIeblWoE0PRuhGxACD/X6f3DJJWAr4DfLrDsVWl5TbXk7QN6cjkvg7ENpA2AB6sG56XxzWcJyIWA08Ca3ckumqU2eZ6RwK/qzSi6rXcZklbARtFxMWdDKwiZT7jTYFNJV0raaakPVoVOuQe1CPpMmC9BpM+X7KIo4DfRsSDy8vB5ABsc62c9YGpwISIeHEgYuugMl2xlOquZTlSenskvR8YD+xUaUTV63Ob84HdycARnQqoYmU+4+Gk00k7k2qEf5T0xoh4olmhQy4xRMRuzaZJekjS+hGxIP8INqpyvRXYQdJRwAhgFUnPRMSgfa7EAGwzktYApgMnRMTMikKtUpmuWGrzzJM0HFgTeKwz4VWiVPczknYjHSTsFBH/6lBsVWm1zSOBNwIz8oHdesA0SftERL/6Feqysvv1zIh4Abhf0l9IieLGZoX6VFLRNGBCfj0BuKj3DBFxWESMjYge4FPAGYM5KZTQcptz1yUXkLb1vA7GNpDKdMVS/14cAFwRubVuOdVym/NplVOBfcqce14O9LnNEfFkRIyOiJ78HZ5J2vblMSlAuf36QtJFBkgaTTq1NKevQp0YiiYDu0uaDeyeh5E0XtJPuhpZdcps80HAjsARkm7Nf1t2J9z+yW0Gta5Y7gbOjYg7JX1Z0j55ttOBtSXdCxxLukpruVVym79Nqvmelz/X5brfspLbvMIoub1/AB6VdBdwJfDpiHi0r3LdJYaZmRW4xmBmZgVODGZmVuDEYGZmBU4MZmZW4MRgZmYFTgzWNZKW5Esk75D0G0mjWsw/Kt9YWGVMb5B0haS/5h5nv1BFf0mSenLPnl+pGzda0gtqs8deSXPz9elI+tNAx2pDjxODddOiiNgyIt5IusP4Ey3mH0XqkqQtkoaVnG810s1BkyNiU+BNwNv6s84GZTfqZWAOsHfd8IHAMj3mNiLetizLm4ETgw0e11HX+ZekT0u6MT8n4Et59GTgtbmW8W1JO0u6uG6ZUyQdkV/PlXSipGuAAyXNkPRNSTfk2sAODWI4FLg2Ii4ByL2OHg1MkrRSLnNU3frulbSupDGSfpXjvVHS9nn6SZJO
k3QJcEaD9S0C7tbLzwM4GDi3rvxm5a4t6RJJt0g6lbr+ciQ9k/+PUHq+ws2Sbpe0bx7fI+luST9W6pv/kpwQzV7ixGBdl4/odyXfyi/pnaS+XLYBtgS2lrQj6U7k+3Ito0zvts9FxNsj4pd5eHhEbAN8Evhig/nfANxUPyIi7iPdGTyC1F3I/jnGbYG5EfEQ6dkcJ0fEW4D3AvV3yW8N7BsRhzaJ8ZfAIZI2BJZQ7OemWblfBK6JiK1I79nYRtsO7B8RbyZ1h/CdulNi40jdML8BeCKXbfaSIdeJng0qq0m6Fegh/SBfmse/M//dkodHkH7MHmiz/HN6Df86/78pr7M30bw31cjlnQj8jPwgnzxtN2CLuqaINSSNzK+nRcSiPmL8PenZAA81iLdZuTsC7wGIiOmSHm+yLV/PCfVFUm1s3Tzt/oi4Nb9u9l7YEObEYN20KCK2lLQmcDGpjeEHpB+1b0TEqfUza+lHbS6mWOtdtdf0Z3sN13oOXULjff9O0o9u/To3AZ6JiKclXQe8TtIY0gONvppnWwl4a+8EkH/Qe8dQEBHPS7oJOI5UY3l33eS+ym3Vl81hpIcpbR0RL0iay8vvT30PqksAn0qyAp9Ksq6LiCeBY4BPSVqZ1OnXhySNAJC0gaR1gKdJ3SbX/I10RP2KnFx2XcZQfgG8Xakb6lpj9A9IT/oi97R6AfBd4O66jsguIbVFkJdrt4PB7wDHN+jYrFm5V5N++JG0J9Do2eRrAg/npPAOYLl/lrN1jhODDQoRcQvpebWH5Mbfs4DrJN1OeszmyPzDeW2+vPXbEfEgqbH2NtKP+i1Nii8bwyLSYxFPUOqz/nZSt8b1l4+eA7yf4mmfY4DxuaH8LuBjba73zoiY0mBSs3K/BOwo6WbSKbdGp9h+kZedRUoi97QTkw1t7l3VzMwKXGMwM7MCJwYzMytwYjAzswInBjMzK3BiMDOzAicGMzMrcGIwM7OC/wOc8l8+QJzcCwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "excess3 = excess_over_median(prices=data, binary=False, resample_by='M', lag=True)\n",
    "\n",
    "s = pd.Series(excess3.values.flatten())\n",
    "ax = s.plot.hist(bins=50)\n",
    "ax.set_xlim(-0.5,0.6)\n",
    "ax.set_xlabel('Return Over Median')\n",
    "ax.set_title('Distribution of Monthly Forward Return Over Median')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## Conclusion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook presents the method to label data according to excess return over median. This method can return either numerical or categorical labels for observations. Zhu et al. utilize these labels to predict monthly stock returns using linear regression and decision trees based on composite features as independent variables. In this process:\n",
    " - Returns are obtained from stock prices, and optionally lagged.\n",
    " - At each time index, the median rate of return for all stocks is calculated. The median is subtracted from each stock's return to find the excess return over median.\n",
    " - If categorical labels are desired, the excess returns are converted to their signs.\n",
    "\n",
    "This method is useful for labeling data used for training regression and classification models. The user can specify for the data to be labeled on a daily, weekly, monthly etc., basis, and whether the returns are lagged."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Zhu, M., Philpotts, D. and Stevenson, M., 2012. The benefits of tree-based models for stock selection. Journal of Asset Management, [online] 13(6), pp.437-448. Available at: <https://link.springer.com/article/10.1057/jam.2012.17>."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
